Siphon Overview

Unidata Python Workshop


Overview:

  • Teaching: 15 minutes
  • Exercises: 15 minutes

Questions

  1. What is a THREDDS Data Server (TDS)?
  2. How can I use Siphon to access a TDS?

Objectives

  1. Use siphon to access a THREDDS catalog
  2. Find data within the catalog that we wish to access
  3. Use siphon to perform remote data access

1. What is THREDDS?

  • Server for providing remote access to datasets
  • Variety of services for accessing data:
    • HTTP Download
    • Web Mapping/Coverage Service (WMS/WCS)
    • OPeNDAP
    • NetCDF Subset Service
    • CDMRemote
  • Provides a more uniform way to access different types/formats of data

THREDDS Catalogs

  • XML descriptions of data and metadata
  • Access methods
  • Easily handled with siphon.catalog.TDSCatalog

In [ ]:
from datetime import datetime, timedelta
from siphon.catalog import TDSCatalog
date = datetime.utcnow() - timedelta(days=1)
cat = TDSCatalog('http://thredds.ucar.edu/thredds/catalog/nexrad/level3/'
                 f'N0Q/LRX/{date:%Y%m%d}/catalog.xml')
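
The catalog URL above is assembled with an f-string date format. As a minimal sketch of just that step, the snippet below builds the same URL for a fixed date; `example_date` is a made-up value chosen so the result is reproducible, whereas the workshop cell uses yesterday's date.

```python
from datetime import datetime

# Fixed example date (an assumption for illustration only).
example_date = datetime(2024, 1, 15)

# {example_date:%Y%m%d} applies strftime-style formatting inside the f-string,
# producing the per-day directory name used in the catalog path.
url = ('http://thredds.ucar.edu/thredds/catalog/nexrad/level3/'
       f'N0Q/LRX/{example_date:%Y%m%d}/catalog.xml')
print(url)
```

Passing a string built this way to `TDSCatalog` is what the cell above does; no network access is needed to check the URL itself.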



2. Filtering data

We could manually work out which dataset we want and construct its name (or index) ourselves. Siphon provides helpers to simplify this, provided the dataset names follow a pattern that includes a timestamp:


In [ ]:
request_time = date.replace(hour=18, minute=30, second=0, microsecond=0)
ds = cat.datasets.filter_time_nearest(request_time)
ds
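
To make the idea of "nearest in time" concrete, here is a rough offline sketch of the same kind of selection done by hand. The dataset names, the regular expression, and the helpers `name_time` and `nearest` are all hypothetical; this is not Siphon's implementation, only an illustration of why the timestamp must appear in the name.

```python
import re
from datetime import datetime

# Hypothetical dataset names that embed a YYYYMMDD_HHMM timestamp.
names = ['Level3_LRX_N0Q_20240115_1812.nids',
         'Level3_LRX_N0Q_20240115_1827.nids',
         'Level3_LRX_N0Q_20240115_1841.nids']

# Illustrative pattern; not necessarily the one Siphon uses internally.
stamp = re.compile(r'(\d{8}_\d{4})')


def name_time(name):
    """Parse the timestamp embedded in a dataset name."""
    return datetime.strptime(stamp.search(name).group(1), '%Y%m%d_%H%M')


def nearest(names, target):
    """Return the name whose embedded time is closest to target."""
    return min(names, key=lambda n: abs(name_time(n) - target))


target = datetime(2024, 1, 15, 18, 30)
print(nearest(names, target))
```

With these made-up names, the 18:27 scan is selected because it is three minutes from the 18:30 request.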

We can also find the list of datasets within a time range:


In [ ]:
datasets = cat.datasets.filter_time_range(request_time, request_time + timedelta(hours=1))
print(datasets)
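
A time-range filter can be pictured the same way: parse a time out of each name and keep those that fall inside the window. The sketch below assumes both endpoints are included and uses made-up names and a hypothetical helper `in_range`; it is an illustration of the idea rather than Siphon's own code.

```python
import re
from datetime import datetime, timedelta

# Hypothetical dataset names with a YYYYMMDD_HHMM timestamp.
names = ['Level3_LRX_N0Q_20240115_1812.nids',
         'Level3_LRX_N0Q_20240115_1841.nids',
         'Level3_LRX_N0Q_20240115_1930.nids',
         'Level3_LRX_N0Q_20240115_1944.nids']


def in_range(names, start, end):
    """Keep names whose embedded time lies in [start, end] (inclusive)."""
    selected = []
    for name in names:
        match = re.search(r'\d{8}_\d{4}', name)
        when = datetime.strptime(match.group(0), '%Y%m%d_%H%M')
        if start <= when <= end:
            selected.append(name)
    return selected


start = datetime(2024, 1, 15, 18, 30)
picked = in_range(names, start, start + timedelta(hours=1))
print(picked)
```

Here the 18:41 and 19:30 entries are kept; the 18:12 and 19:44 entries fall outside the one-hour window.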

Exercise


In [ ]:
# YOUR CODE GOES HERE

Solution


In [ ]:
# %load solutions/datasets.py



3. Accessing data

Browsing catalogs is only part of the story; Siphon is most useful when you need to access or download the datasets themselves.

For instance, using the datasets we just retrieved:


In [ ]:
ds = datasets[0]

We can ask Siphon to download the file locally:


In [ ]:
ds.download()

In [ ]:
import os
os.listdir()

Or better yet, get a file-like object that lets us read from the file as if it were local:


In [ ]:
fobj = ds.remote_open()
data = fobj.read()
print(len(data))

This is handy if you have Python code to read a particular format.
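
As a small sketch of why a file-like object is convenient: a reader written against `.read()` works the same whether the bytes come from disk or from a remote server. Below, `io.BytesIO` stands in for the object returned by `ds.remote_open()`, and both the payload bytes and the helper `read_header` are made up for illustration.

```python
import io


def read_header(fobj, nbytes=4):
    """Read the first nbytes from any binary file-like object."""
    return fobj.read(nbytes)


# io.BytesIO is an offline stand-in for ds.remote_open();
# the payload is made-up bytes, not real radar data.
fake_remote = io.BytesIO(b'SDUS' + b'\x00' * 12)

header = read_header(fake_remote)
rest = fake_remote.read()  # continues from where the header read stopped
print(header, len(rest))
```

The same `read_header` function could be handed an object from `open(path, 'rb')` without changes.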

It's also possible to open the remote file through services that provide a netCDF4-like interface. This kind of access downloads only the variables of interest, or only (index-based) subsets of them:


In [ ]:
nc = ds.remote_access()

By default this uses CDMRemote (if available), but it's also possible to ask for OPeNDAP (using netCDF4-python).


In [ ]:
print(list(nc.variables))
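
To request OPeNDAP rather than the default, one option is to pass a service name to `remote_access`. The sketch below assumes the dataset exposes an `access_urls` mapping keyed by service name and that `'OPENDAP'` is the key used for that service; the helper `open_dataset` and the `FakeDataset` stand-in are hypothetical and exist only so the example runs without a server.

```python
def open_dataset(ds, prefer='OPENDAP'):
    """Open ds remotely, using the preferred service when it is listed."""
    if prefer in ds.access_urls:
        return ds.remote_access(service=prefer)
    # Fall back to whatever the default service is.
    return ds.remote_access()


class FakeDataset:
    """Offline stand-in mimicking the two dataset members used above."""

    def __init__(self, services):
        self.access_urls = {name: f'http://example.invalid/{name}'
                            for name in services}

    def remote_access(self, service=None):
        # A real dataset would return an opened remote file here;
        # the stand-in just reports which service was requested.
        return service or 'default'


with_dap = open_dataset(FakeDataset(['OPENDAP', 'HTTPServer']))
without_dap = open_dataset(FakeDataset(['HTTPServer']))
print(with_dap, without_dap)
```

With a real catalog entry, `open_dataset(ds)` would be used in place of `ds.remote_access()`; checking `access_urls` first avoids asking for a service the server does not offer.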
